        If you use plots from MultiQC in a publication or presentation, please cite:

        MultiQC: Summarize analysis results for multiple tools and samples in a single report
        Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller
        Bioinformatics (2016)
        doi: 10.1093/bioinformatics/btw354
        PMID: 27312411
        About MultiQC

        This report was generated using MultiQC, version 1.30

        You can see a YouTube video describing how to use MultiQC reports here: https://youtu.be/qPbIlO_KWN0

        For more information about MultiQC, including other videos and extensive documentation, please visit http://multiqc.info

        You can report bugs, suggest improvements and find the source code for MultiQC on GitHub: https://github.com/MultiQC/MultiQC

        A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.

        Report generated on 2025-09-15, 13:30 CEST based on data in: /cfs/klemming/projects/supr/uppstore2017170/rawData/hg19/WGS_longread/231004_PacBio_MM_celllines/analysis/sequali

        General Statistics

        Sample Name                              GC %      Mean length    Total reads    % est. dups.
        pr_023_001_OPM2_hifi_reads.default       40.46%    12726.4 bp     6.8M           0.02%
        pr_023_002_KMS12BM_hifi_reads.default    40.75%    12006.8 bp     7.0M           0.02%
        pr_023_003_MM1S_hifi_reads.default       40.41%    12250.3 bp     6.1M           0.03%

        Sequali

        Version: 1.0.2

        Sequencing quality control for both long-read and short-read data.
        URL: https://github.com/rhpvorderman/sequali
        DOI: 10.1093/bioadv/vbaf010

        Features adapter search, overrepresented sequence analysis and duplication analysis and supports FASTQ and uBAM inputs.

        Sequence Counts

        Sequence counts for each sample. Duplicate read counts are an estimate.

        This plot shows the total number of reads broken down into unique and duplicate reads.

        The methodology to estimate duplication uses fingerprinting with subsampling based on the fingerprints themselves. This mitigates biases that might occur in estimates that only look at the first reads.

        Sequali fingerprints each read by combining an 8 bp fragment at an offset of 64 bp from the beginning with an 8 bp fragment at an offset of 64 bp from the end. The offsets were chosen to limit the chance of adapter sequences contaminating the fingerprint.
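The fingerprinting idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Sequali's actual implementation: the function names are hypothetical, the end fragment is assumed to stop 64 bp before the end of the read, reads too short to hold both fragments are simply skipped, and the subsampling rule (keep a fingerprint only when the lowest bits of a CRC32 hash are zero) is an assumed stand-in for "subsampling based on the fingerprints themselves".

```python
import zlib
from collections import Counter
from typing import Iterable, Optional

FRAGMENT_LENGTH = 8  # bp taken near each end, as stated in the text
OFFSET = 64          # bp skipped at each end, as stated in the text


def fingerprint(sequence: str) -> Optional[str]:
    """Join an 8 bp fragment near the start with an 8 bp fragment near the end.

    Assumption: reads too short to hold both fragments return None.
    """
    if len(sequence) < 2 * (OFFSET + FRAGMENT_LENGTH):
        return None
    front = sequence[OFFSET:OFFSET + FRAGMENT_LENGTH]
    back_end = len(sequence) - OFFSET
    back = sequence[back_end - FRAGMENT_LENGTH:back_end]
    return front + back


def keep_fingerprint(fp: str, bits: int) -> bool:
    """Hypothetical subsampling rule: keep a fingerprint only when the
    lowest `bits` bits of its hash are zero (bits=0 keeps everything)."""
    mask = (1 << bits) - 1
    return (zlib.crc32(fp.encode("ascii")) & mask) == 0


def count_fingerprints(reads: Iterable[str], bits: int = 0) -> Counter:
    """Count how often each retained fingerprint occurs."""
    counts: Counter = Counter()
    for read in reads:
        fp = fingerprint(read)
        if fp is not None and keep_fingerprint(fp, bits):
            counts[fp] += 1
    return counts
```

Because the decision to keep a fingerprint depends only on the fingerprint itself, every copy of a duplicated read is either retained or dropped together, regardless of where it appears in the file.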


        Sequence Quality Per Position

        The mean quality value across each base position.

        Only mean scores are plotted. The means are approximate because Sequali stores 12 Phred categories per position (0-3, 4-7, and so on, up to 44 and higher) rather than counts for all 94 discrete Phred scores. For context, Illumina FASTQ files only utilize four different Phred scores.
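As a rough illustration of the binning described above, the sketch below maps a Phred score to one of 12 categories of width 4, with the last category collecting everything from 44 upward. The function names are hypothetical and this is not taken from Sequali's source.

```python
NUMBER_OF_CATEGORIES = 12  # as stated in the text
CATEGORY_WIDTH = 4         # 0-3, 4-7, ... implies a width of 4


def phred_category(phred: int) -> int:
    """Return the category index (0-11) for a Phred score."""
    if phred < 0:
        raise ValueError("Phred scores cannot be negative")
    # Scores of 44 and higher all fall into the last category.
    return min(phred // CATEGORY_WIDTH, NUMBER_OF_CATEGORIES - 1)


def category_label(index: int) -> str:
    """Return a human-readable label such as '4-7' or '44+'."""
    if not 0 <= index < NUMBER_OF_CATEGORIES:
        raise ValueError("category index out of range")
    low = index * CATEGORY_WIDTH
    if index == NUMBER_OF_CATEGORIES - 1:
        return f"{low}+"
    return f"{low}-{low + CATEGORY_WIDTH - 1}"
```

Since each category spans four scores, a mean reconstructed from category counts can only approximate the true per-position mean.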

        As Phred scores are logarithmic, the means are computed by converting each base's Phred score to an error probability, averaging those probabilities over the total number of bases, and converting the result back into a Phred score. Tools that average Phred scores naively are prone to overestimate the average quality by orders of magnitude. As such, Sequali might give a different plot here than other QC tools.
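The difference between the two averaging methods can be shown with a small Python sketch. It uses the standard Phred definition, Q = -10 * log10(p); the function names are illustrative and not part of Sequali.

```python
import math
from typing import Sequence


def phred_to_probability(phred: float) -> float:
    """Convert a Phred score to an error probability."""
    return 10 ** (-phred / 10)


def probability_to_phred(probability: float) -> float:
    """Convert an error probability back to a Phred score."""
    return -10 * math.log10(probability)


def mean_phred(scores: Sequence[float]) -> float:
    """Average the error probabilities, then convert back to Phred."""
    if not scores:
        raise ValueError("at least one score is required")
    mean_error = sum(phred_to_probability(q) for q in scores) / len(scores)
    return probability_to_phred(mean_error)


def naive_mean_phred(scores: Sequence[float]) -> float:
    """Arithmetic mean of the Phred scores themselves, for comparison."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(scores) / len(scores)
```

For two bases with scores 10 and 30, the error probabilities are 0.1 and 0.001, so the mean error is 0.0505 and `mean_phred([10, 30])` is about 12.97, whereas `naive_mean_phred([10, 30])` is 20. The naive value understates the error rate roughly fivefold in this small case.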


        Per Sequence Average Quality Scores

        The number of reads at each average quality score.

        Shows the quality score profile at the read level. As Illumina FASTQ files only utilize four different Phred scores, the plot may look a bit erratic at times. Due to the logarithmic nature of Phred scores, lower Phred scores have a more significant impact on the average quality than higher Phred scores.

        As Phred scores are logarithmic, the means are computed by converting each base's Phred score to an error probability, averaging those probabilities over the total number of bases, and converting the result back into a Phred score. Tools that average Phred scores naively are prone to overestimate the average quality by orders of magnitude. As such, Sequali might give a different plot here than other QC tools.


        Per Position GC Content

        The GC content percentage at each position for each sample.


        Per Sequence GC Content

        The GC content distribution of the sequences for each sample.


        Sequence Length Distribution

        The distribution of read lengths found.


        Sequence Duplication Levels

        The relative level of duplication found for every sequence.

        The methodology to estimate duplication uses fingerprinting with subsampling based on the fingerprints themselves. This mitigates biases that might occur in estimates that only look at the first reads.

        Sequali fingerprints each read by combining an 8 bp fragment at an offset of 64 bp from the beginning with an 8 bp fragment at an offset of 64 bp from the end. The offsets were chosen to limit the chance of adapter sequences contaminating the fingerprint.


        Top overrepresented sequences

        The top 20 overrepresented sequences in all libraries

        Sequence                 Best Match                                            Libraries Affected (%)
        AAAAAAAAAAAAAAAAAAAAA    Poly-A/T repeat. Common pattern in Human Genome.      100.00%
        ACACACACACACACACACACA    Poly-CA/GT repeat. Common pattern in Human Genome.    100.00%
        CACACACACACACACACACAC    Poly-CA/GT repeat. Common pattern in Human Genome.    100.00%

        Adapter Content

        The cumulative percentage count of the found adapter sequences

        Note that only samples with >= 0.1% adapter contamination are shown. There may be several adapters detected per sample. For long-read data there may be more adapters per sample, because the false positive detection rate increases with read length.


        Software Versions

        Software Versions lists versions of software tools extracted from file contents.

        Software    Version
        Sequali     1.0.2